{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1mBr22Ov8xN6Piy6M38Tr5wOYjpmT_IoH#scrollTo=pNpHQn6FlCL1)"
      ],
      "metadata": {
        "id": "pNpHQn6FlCL1"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Comparing Top 10 LMSYS Models with Portkey\n",
        "\n",
        "---"
      ],
      "metadata": {
        "id": "ynEbjiyQlJat"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "The [LMSYS Chatbot Arena](https://chat.lmsys.org/?leaderboard), with over **1,000,000** human comparisons, is the gold standard for evaluating LLM performance.\n",
        "\n",
        "But, testing multiple LLMs is a ***pain***, requiring you to juggle APIs that all work differently, with different authentication and dependencies.\n",
        "\n",
        "<img src=\"https://portkey.ai/blog/content/images/size/w1600/2024/06/CleanShot-2024-06-06-at-19.48.47@2x.png\" width=700 />\n",
        "\n",
        "**Enter Portkey:** A unified, open source API for accessing over 200 LLMs. Portkey makes it a breeze to call the models on the LMSYS leaderboard - no setup required.\n",
        "\n",
        "---\n",
        "\n",
        "\n",
        "In this notebook, you'll see how Portkey streamlines LLM evaluation for the **Top 10 LMSYS Models**, giving you valuable insights into cost, performance, and accuracy metrics.\n",
        "\n",
        "Let's dive in!\n",
        "\n",
        "---"
      ],
      "metadata": {
        "id": "bUQdnOHYqWLj"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Video Guide\n",
        "The notebook comes with a video guide that you can follow along\n",
        "\n",
        "<a href=\"https://youtu.be/A1ZJV1ML2qI\"><img src=\"https://portkey.ai/blog/content/images/size/w1600/2024/06/CleanShot-2024-06-06-at-21.19-1.png\" width=500 /></a>"
      ],
      "metadata": {
        "id": "i0_L26gIOWmf"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Setting up Portkey\n",
        "\n",
        "To get started, install the necessary packages:"
      ],
      "metadata": {
        "id": "a7sDiU-IGzEm"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "KldJobxHjBNu"
      },
      "outputs": [],
      "source": [
        "!pip install -qU portkey-ai openai"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Next, sign up for a Portkey API key at https://app.portkey.ai/. Navigate to \"Settings\" -> \"API Keys\" and create an API key with the appropriate scope."
      ],
      "metadata": {
        "id": "u281LJpvOhjv"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Defining the Top 10 LMSYS Models\n",
        "\n",
        "Let's define the list of Top 10 LMSYS models and their corresponding providers."
      ],
      "metadata": {
        "id": "tA9Piq_tHYAt"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "top_10_models = [\n",
        "    [\"gpt-4o-2024-05-13\", \"openai\"],\n",
        "    [\"gemini-1.5-pro-latest\", \"google\"],\n",
        "##  [\"gemini-advanced-0514\",\"google\"],             # This model is not available on a public API\n",
        "    [\"gpt-4-turbo-2024-04-09\", \"openai\"],\n",
        "    [\"gpt-4-1106-preview\",\"openai\"],\n",
        "    [\"claude-3-opus-20240229\", \"anthropic\"],\n",
        "    [\"gpt-4-0125-preview\",\"openai\"],\n",
        "##  [\"yi-large-preview\",\"01-ai\"],                  # This model is not available on a public API\n",
        "    [\"gemini-1.5-flash-latest\", \"google\"],\n",
        "    [\"gemini-1.0-pro\", \"google\"],\n",
        "    [\"meta-llama/Llama-3-70b-chat-hf\", \"together\"],\n",
        "    [\"claude-3-sonnet-20240229\", \"anthropic\"],\n",
        "    [\"reka-core-20240501\",\"reka-ai\"],\n",
        "    [\"command-r-plus\", \"cohere\"],\n",
        "    [\"gpt-4-0314\", \"openai\"],\n",
        "    [\"glm-4\",\"zhipu\"],\n",
        "##  [\"qwen-max-0428\",\"qwen\"]                       # This model is not available outside of China\n",
        "]"
      ],
      "metadata": {
        "id": "ZPlY4GC1sBHK"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Add Provider API Keys to Portkey Vault\n",
        "\n",
        "ALL the providers above are integrated with Portkey - which means, you can add their API keys to Portkey vault and get a corresponding **Virtual Key** and streamline API key management.\n",
        "\n",
        "| Provider | Link to get API Key | Payment Mode |\n",
        "| :-- | :-- | :-- |\n",
        "| openai | https://platform.openai.com/ | Wallet Top Up |\n",
        "| anthropic | https://console.anthropic.com/ | Wallet Top Up |\n",
        "| google | https://aistudio.google.com/ | 💰 Free to Use |\n",
        "| cohere | https://dashboard.cohere.com/ | 💰 Free Credits |\n",
        "| together-ai | https://api.together.ai/ | 💰 Free Credits |\n",
        "| reka-ai | https://platform.reka.ai/ | Wallet Top Up |\n",
        "| zhipu | https://open.bigmodel.cn/ | 💰 Free to Use |"
      ],
      "metadata": {
        "id": "QqxZQqQd9DOo"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "## Replace the virtual keys below with your own\n",
        "\n",
        "virtual_keys = {\n",
        "    \"openai\": \"openai-new-c99d32\",\n",
        "    \"anthropic\": \"anthropic-key-a0b3d7\",\n",
        "    \"google\": \"google-66c0ed\",\n",
        "    \"cohere\": \"cohere-ab97e4\",\n",
        "    \"together\": \"together-ai-dada4c\",\n",
        "    \"reka-ai\":\"reka-54f5b5\",\n",
        "    \"zhipu\":\"chatglm-ba1096\"\n",
        "}"
      ],
      "metadata": {
        "id": "CTSM9WO29D88"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Running the Models with Portkey\n",
        "\n",
        "Now, let's create a function to run the Top 10 LMSYS models using OpenAI SDK with Portkey Gateway:"
      ],
      "metadata": {
        "id": "axm5K0Ba_VRJ"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from openai import OpenAI\n",
        "from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders\n",
        "\n",
        "def run_top10_lmsys_models(prompt):\n",
        "    outputs = {}\n",
        "\n",
        "    for model, provider in top_10_models:\n",
        "        portkey = OpenAI(\n",
        "            api_key = \"dummy_key\",\n",
        "            base_url = PORTKEY_GATEWAY_URL,\n",
        "            default_headers = createHeaders(\n",
        "                api_key=\"YOUR_PORTKEY_API_KEY\",                 # Grab from https://app.portkey.ai/\n",
        "                virtual_key = virtual_keys[provider],\n",
        "                trace_id=\"COMPARING_LMSYS_MODELS\"\n",
        "            )\n",
        "        )\n",
        "\n",
        "        response = portkey.chat.completions.create(\n",
        "            messages=[{\"role\": \"user\", \"content\": prompt}],\n",
        "            model=model,\n",
        "            max_tokens=256\n",
        "        )\n",
        "\n",
        "        outputs[model] = response.choices[0].message.content\n",
        "\n",
        "    return outputs"
      ],
      "metadata": {
        "id": "VmnhzSYivqAz"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Comparing Model Outputs\n",
        "\n",
        "To display the model outputs in a tabular format for easy comparison, we define the print_model_outputs function:"
      ],
      "metadata": {
        "id": "dCwS-eoH_k3U"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from tabulate import tabulate\n",
        "\n",
        "def print_model_outputs(prompt):\n",
        "    outputs = run_top10_lmsys_models(prompt)\n",
        "\n",
        "    table_data = []\n",
        "    for model, output in outputs.items():\n",
        "        table_data.append([model, output.strip()])\n",
        "\n",
        "    headers = [\"Model\", \"Output\"]\n",
        "    table = tabulate(table_data, headers, tablefmt=\"grid\")\n",
        "    print(table)\n",
        "    print()"
      ],
      "metadata": {
        "id": "Z0y5BPgRvwIC"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Example: Evaluating LLMs for a Specific Task\n",
        "\n",
        "Let's run the notebook with a specific prompt to showcase the differences in responses from various LLMs:\n",
        "\n",
        "On Portkey, you will be able to see the logs for all models:\n",
        "<br /><br />\n",
        "<img src=\"https://portkey.ai/blog/content/images/size/w1600/2024/06/lmsys-2.gif\" width=700 />"
      ],
      "metadata": {
        "id": "zH0jLuLi_qlv"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "prompt = \"If 20 shirts take 5 hours to dry, how much time will 100 shirts take to dry?\"\n",
        "\n",
        "print_model_outputs(prompt)"
      ],
      "metadata": {
        "id": "Cf6XZ0dIvwFv"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Conclusion\n",
        "\n",
        "With minimal setup and code modifications, Portkey enables you to streamline your LLM evaluation process and easily call 200+ LLMs to find the best model for your specific use case.\n",
        "\n",
        "Explore Portkey further and integrate it into your own projects. Visit the Portkey documentation at https://docs.portkey.ai/ for more information on how to leverage Portkey's capabilities in your workflow."
      ],
      "metadata": {
        "id": "6FmQrjR__2yo"
      }
    }
  ]
}